Immune–related biomarkers shared by inflammatory bowel disease and liver cancer

It has been indicated that there is an association between inflammatory bowel disease (IBD) and hepatocellular carcinoma (HCC). However, the molecular mechanism underlying the risk of developing HCC among patients with IBD is not well understood. The current study aimed to identify shared genes and potential pathways and regulators between IBD and HCC using a system biology approach. By performing the different gene expression analyses, we identified 871 common differentially expressed genes (DEGs) between IBD and HCC. Of these, 112 genes overlapped with immune genes were subjected to subsequent bioinformatics analyses. The results revealed four hub genes (CXCL2, MMP9, SPP1 and SRC) and several other key regulators including six transcription factors (FOXC1, FOXL1, GATA2, YY1, ZNF354C and TP53) and five microRNAs (miR-124-3p, miR-34a-5p, miR-1-3p, miR-7-5p and miR-99b-5p) for these disease networks. Protein-drug interaction analysis discovered the interaction of the hub genes with 46 SRC-related and 11 MMP9- related drugs that may have a therapeutic effect on IBD and HCC. In conclusion, this study sheds light on the potential connecting mechanisms of HCC and IBD.


Introduction
Hepatocellular carcinoma (HCC), a major malignant form of the liver, is known as one of the most dangerous cancers and a leading cause of cancer deaths globally [1]. Surgical resection, transplantation and local ablation remain as standard therapeutic regimens for patients at an early stage of HCC with high overall survival (OS) rates. HCC patients with later stages, on the other hand, are usually subscribed for radio-/chemotherapies with significantly poorer outcomes [2]. These findings suggest an urgent demand for novel early diagnostic biomarkers and clinical treatment guidance for HCC patients. Generally, HCC develops in patients with cirrhosis and chronic liver inflammation, which were driven by a hepatitis virus infection, alcohol consumption, long-term smoking and non-alcoholic fatty liver-associated diseases [3][4][5][6]. Emerging evidence has demonstrated a contributing role of gut microbiomes in HCC development. Accordingly, microbial dysbiosis and leaky gut stimulate the release of microbiota-associated metabolites, remarkably contributing to hepatic inflammation, fibrosis, cell growth and anti-apoptosis signals [7,8]. Inflammatory bowel disease (IBD) is a chronic inflammation of the gastrointestinal tract, which includes Crohn's disease (CD) and ulcerative colitis (UC) [9]. Despite of being different in the clinical features, the pathogenesis of CD and UC involves the same risk factors, such as genetic susceptibility and alteration in gut microbiome and immune response [9]. Abnormal gut microbiota, for example, may contribute to intestinal inflammation and immune response dysregulation that eventually result in IBD [9][10][11]. Moreover, recent researches have demonstrated that severe IBD can lead to gastrointestinal cancers [12,13] as well as various extraintestinal manifestations including cardiovascular diseases, immune-mediated diseases [14][15][16][17] and malignancies such as cholangiocarcinoma, lymphoma, melanoma [18][19][20][21][22][23]. Remarkably, HCC risk among patients with IBD, especially among those with CD, have been repeatedly reported [21,[24][25][26]. These findings suggested a potential cross-talk in the pathophysiological pathways of IBD and HCC that needs to be further elucidated.
Despite numerous efforts on decoding the fundamental signaling molecules, the biomarkers and mutual underlying molecular pathways between HCC and IBD remained poorly understood. Therefore, in the present study, we have applied a systems biology approach to discover common potential biomarkers and the underlying mechanisms, thereby providing new insights into the pathology and clarifying the mutual immunity mechanisms of IBD and HCC.

Data query
The workflow for the current study is presented in Fig 1. To discover the differentially expressed genes (DEGs) in IBD and HCC, three microarray datasets namely GSE75214 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75214) for IBD and GSE14520 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE14520) and the Cancer Genome Atlas TCGA-LIHC (https://www.cancer.gov) for HCC were used as the input. IBD dataset (GSE75214) comprises the gene expression of biopsies from the colon of 11 controls, 97 UC and eight CD patients and from the (neo-)terminal ileum of 11 controls and 67 CD patients. The HCC dataset GSE14520 contains the gene expression of 225 HCC samples and 220 normal liver tissues and TCGA-LIHC includes 50 normal control and 374 HCC samples.

Differential gene expression analysis
Prior to the differential gene expression analyses, data were transformed using log2 function in R program (ver. 4.0.2; R Development Core Team, Vienna, Austria). Then, a principal component analysis (PCA) was conducted using prcomp function to remove outliers. The genes (probes) that were expressed in less than three samples were excluded. The Limma [27] and DESeq2 packages [28] were used to find the significant DEGs and the p-values were corrected using the False Discovery Rate (FDR) correction toolkit in the R software (ver. 4.0.2; R Development Core Team, Vienna, Austria) for GSE and TCGA data. The significant DEGs with FDR < 0.05 for GSE (fold change > 1) and TCGA (fold change > 2) data were identified. A total of 1,793 immune genes (IMGs) were attained from 17 categories from the Analysis Portal (ImmPort) website (https://www.immport.org) and Immunology Database after excluding the duplicates [29]. The common DEGs between IBD, HCC and IMGs datasets were then identified and visualized by using VennDiagram package [30]. prognostic risk score. Univariate Cox regression analysis was performed to explore the correlation between the DEGs and OS [32]. The hazard ratio (HR) of death was computed and Bonferroni adjusted p-values < 0.05 was considered statistically significant.

The protein expressions of prognostic hub genes
The Human Protein Atlas (HPA; https://www.proteinatlas.org/) is an open-access resource for human transcriptome and proteome [33]. The hub genes were validated by immunohistochemistry results in HCC and normal tissues obtained from the HPA database and a previous study [34].

The immune infiltration of prognostic hub genes
Since lymphocyte infiltration is an important indicator for lymph nodes' status and cancer survival, the association between the hub genes expression levels and the immune infiltration levels in HCC was evaluated using TIMER version 1 (TIMER: Tumor Immune Estimation Resource) [35].

Identification of DEGs-interacted transcription factors and microRNAs
To identify transcription factors (TFs) and microRNAs (miRNAs) that bind to the hub genes to regulate their expression, TF-target and miRNA-target interaction analyses were performed using two open-access databases, JASPAR [36] and MirTarbase [37], respectively, followed by a topological analysis using NetworkAnalyst [38].

Protein-drug interaction analysis
Protein-drug interaction analysis provides information about the potential interaction between drugs and the target genes [39]. To identify potential drugs from the Comparative Toxicogenomics Database (CTD) that might interact with the common DEGs, protein-drug interaction analysis was performed via NetworkAnalyst [38].

Gene-disease association analysis
The gene-disease associations by DisGeNET, which cover a wide range of biomedical characteristics of diseases, was commonly used to understand human genetic diseases [40]. The relationship between common DEGs and associated diseases was explored through NetworkAnalyst [38].

DNA methylation analysis
The UALCAN tool was used to find the correlation between DNA methylation and four hub genes. It provided information on TCGA gene expression, DNA methylation, clinical data and friendly web resource [41].

Common DEGs among HCC, IBD and IMGs
Gene expression datasets (HCC-GSE14520, HCC-TCGA and IBD-GSE75214) and IMGs list were collected in the current study. There were 9,045, 10,657, and 4,406 significant DEGs in the IBD-GSE75214, HCC-GSE14520, HCC-TCGA datasets, respectively. The Jvenn tool showed common DEGs among groups, 112 common DEGs among HCC, IBD and IMGs have been detected (Fig 2; S1 Table). These common DEGs were then subjected to further downstream analyses.

Gene ontology and pathway enrichment analysis of common DEGs
The top significantly enriched GOs and KEGG pathways were shown in Fig 3A and 3B, respectively. The GO analysis revealed that these common DEGs were significantly contributed to negative regulation of response to external stimulus, cell chemotaxis and regulation of inflammatory response under biological process ( Fig 3A). For cellular component-GOs, common DEGs were significantly involved in the external side of the plasma membrane and secretory granule lumen. Lastly, for molecular function, DEGs were mainly involved in receptor-ligand activity, signaling receptor activator activity, cytokine activity, etc. (Fig 3A). In addition, the most importantly enriched KEGG pathways included cytokine-cytokine receptor interaction, viral protein interaction with cytokine and cytokine receptors, axon guidance, IL-17 signaling pathway in cancer ( Fig 3B).

Determination of hub proteins
This PPI network analysis via STRING database is commonly used to investigate the biological responses in disease and health conditions. The PPI network of 112 common DEGs analysis revealed 20 hub genes that meet the cut-off degree > 21 ( Fig 4A; S2 Table). Six modules for potential hub genes in the PPI network with MCODE score > 2 and nodes > 3 were identified

PLOS ONE
(S1 Fig). The visualization of the main module using Cytoscape showed four hub proteins in the center, namely CXCL2, MMP9, SPP1 and SRC (Fig 4B). The topological features and involvement of the hub proteins in HCC and IBD were presented in Table 1.

The hub genes validation
The twenty hub genes from the PPI network with the highest degree were subjected to the survival analysis using univariate Cox analysis. The results exposed the significance of four hub genes (CXCL2, MMP9, SPP1 and SRC) as prognostic makers of HCC (Fig 5; S3 Table). Specifically, the increased expression levels of MMP9 (p = 0.028), SPP1 (p = 0.0001) and SRC (p = 0.032) and the decreased expression levels of CXCL2 (p = 0.026) were strongly related to poorer prognosis in HCC patients ( Fig 5).

The hub genes immunohistochemistry expression
The protein expression levels of four hub genes (CXCL2, MMP9, SPP1 and SRC) in HCC and control group was explored via the HPA database and a previous study [34]. Accordingly, protein expression levels of MMP9, SPP1 and SRC were substantially increased and CXCL2 was decreased in HCC compared to the controls (Fig 6).

The correlation between hub genes and immune cell infiltration and their methylation status in HCC
We used TIMER online tool to explore association between hub genes and six immune cell types (CD4+/CD8+ T cells, B cells, macrophages, neutrophils and dendritic cells) and tumor   In addition, gene expression and methylation analysis showed significant differences in both gene expression ( Fig 7A) and methylation patterns ( Fig 7B) of CXCL2, MMP9, SPP1 and SRC between liver tumor and normal liver tissue samples. Furthermore, a negative correlation between methylation patterns and gene expression was also noted for three genes (MMP9, SPP1 and SRC). This finding indicated that upregulation of these three hub genes might be a result of their diminished DNA methylation in HCC.

Determination of regulatory signatures
Next, a network-based approach was performed to screen for the DEG-TF, DEG-miRNA interactions, thereby detecting the potential regulatory molecules of the hub DEGs. The gene-TF and gene-miRNA networks revealed six TFs namely FOXC1, FOXL1, GATA2, YY1, ZNF354C and TP53 (Fig 8A; S4 Table) and five miRNAs namely miR-124-3p, miR-34a-5p, miR-1-3p, miR-7-5p and miR-99b-5p (Fig 8B; S5 Table) as the potential regulators of the four hub genes. The biological functions of these TFs and miRNAs in HCC are presented in Table 2.

Recognition of protein-drug and gene-disease interactions
To find drugs that can target the hub proteins, we studied protein-drug interactions and identified 46 SRC-related and 11 MMP9-related drugs (S3 Fig; S6 and S7 Tables). In addition, the gene-disease analysis showed that three out of four hub genes (CXCL2, SRC and SPP1) were also associated with mammary neoplasms, pulmonary fibrosis, dermatitis and allergic contact diseases (S4 Fig; S8 Table).

Discussion
Despite the immune system has been well recognized for its roles in regulating tumorigenesis, there are no effective molecular targets currently available in routine clinical practice since the exact mechanisms involved in its pathogenesis remain poorly understood [75][76][77]. The immunotherapy remains unclear in liver cancer; therefore, it is necessary to classify the potentially

PLOS ONE
effective patients who might be benefit from the therapy and to predict the outcomes. IBD itself alters the gut microbiome [78,79]. The harmful bacteria were then directed to portal circulation, causing abnormal expression of cell adhesion molecules, thereby increasing the risk of liver cancer [25]. Furthermore, the treatment therapies of IBD might also stimulate the HCC progression related to the impairment of the immune response [12]. In this study, we identified 112 common DEGs among IBD, HCC datasets and an immune gene list. Both GO and KEGG pathway analyses revealed a significant role of the inflammatory response in the HCC progression, in which cytokine-cytokine receptor interaction is the most common between IBD-and HCC-associated genes. In addition, this interaction was reported to be remarkably associated with HCC, supporting our observation [80,81].
The current study identified CXCL2, MMP9, SPP1 and SRC as the four hub genes among IBD, HCC and IMGs. Notably, CXCL2, a small cytokine of the CXC chemokine family, was identified as a top significant gene through PPI network, IHC staining and survival analysis. CXCL2 related to neutrophil response under various conditions such as wound healing, cancer metastasis and angiogenesis [82]. Recently, CXCL2 was reported as an inhibitor of the HCC cell cycle [34]. Exosomes containing CXCL2 or expressing CXCL2 receptors improved chemotaxis of HCC; thus, it was exploited for targeted drug delivery. Additionally, CXCL2 was also noted to be an important cytokine for IBD [43]. Particularly, CXCR2, mediated the release of neutrophils from the bone marrow via binding to its two ligands (CXCL1, CXCL2) [83].
MMP9, SPP1 and SRC, in contrast, were positive regulators of HCC cell death. In this study, SPP1 is the most significant interaction gene of HCC and IBD. Osteopontin, a protein encoded by the SPP1 gene, is up-regulated in IBD [84]. SPP1, with inference value of 129.66 from the gene-disease association dataset, is regarded as potential drug targets for the liver cancer treatment. Moreover, SPP1 promotes HCC growth and induces resistance to cell apoptosis, suggesting that SSP1 is a potential therapeutic target in HCC [46]. MMP9 plays a vital role in promoting cell migration and metastasis [85]. HCC develops as a result of a change in MMP 9 protein expression [86]. Lastly, SRC, which belongs to a group of SRC family kinases, primarily involved in the regulation of embryonic development and cell growth. A previous study showed that increased SRC expression and activity promoted cancer progression processes, including cell proliferation, differentiation, invasion and migration [87]. In HCC, SRC signaling pathway contributes to cell growth, metastasis and drug resistance via targeting ASPP2, TIGF, L-FABP, GRP78, CD47 and TM4SF5/CD44 [87]. Saracatinib, an SRC inhibitor, might improve the outcomes of liver cancer patients and have been approved by the food and drug administration organization for the treatment of HCC [88]. Moreover, the IHC staining results showed an increase of MMP9, SPP1 and SRC protein expression levels and a significant decrease in CXCL12 [34] in HCC tissues. Besides, our results showed a moderate positive

PLOS ONE
association between four hub genes and infiltration levels of macrophages and dendritic cells. These outcomes were supported by GO results, suggesting the possible regulatory role of the target genes in negative response with an external stimulus of tumor-related to those cells. Likewise, these results indicated that three hub genes can activate macrophages, resulting in increased T cell exhaustion. Moreover, this study discovered negative correlations between the gene expression levels and methylation status of three hub genes (MMP9, SPP1 and SRC). Interestingly, when the expression levels of three hub genes significantly increased, their methylation profile was significantly inhibited. In addition, the upregulation of SPP1 and SRC due to lower promoter methylation in liver carcinoma has been reported [89,90], which is in agreement with the findings of this study. Different interaction networks have been analyzed to identify the potential regulatory miR-NAs and TFs of the hub genes. Among the identified TFs, FOXC1 was considered as a novel biomarker for the early stages of HCC [53]. Foxl1 was able to induce liver repair by activating the canonical Wnt/b-catenin pathway [91]. Reduced GATA2 expression was related to a poor outcome of HCC following resection [55]. YY1 enhanced linc01134 transcription by interacts with linc01134 promoter to mediates HCC progression [92]. In addition, YY1 has been shown to be an important mediator of the mTORC1, a signaling pathway in immune cell metabolism

GATA2
Decreased expression of GATA2 was correlated with poor prognosis of HCC. [55] YY1 YY1 is upregulated in HCC cell lines, which promotes tumor progression and inhibits cell differentiation in HCC. YY1 promotes the malignancy of HCC by increasing the expression of Quaking that is associated with poor outcomes of HCC. [56-59]

ZNF354C
ZNF354C is a transcriptional repressor in HCC; patients with higher expression levels of ZNF354C exhibit a better overall survival. [60] TP53 TP53 is a well-known tumor suppressor gene that was mutated in >30% of HCC patients. HCC patients with TP53 hot-spot mutations (R249S and V157F) have poorer prognosis.
[61, 62] microRNAs miR-124-3p MiR-124-3p is a putative tumor suppressor whose expression was often reduced in HCC cells and tissues. It inhibits the proliferation, invasion and metastasis of HCC and is being considered as a novel diagnostic marker and therapeutic target for HCC [63][64][65] miR-34a-5p MiR-34a-5p was usually downregulated in liver cancer cells and tissues. Its overexpression inhibits HCC cells growth and progression, while enhances apoptosis in HCC cells.
[ [71][72][73] miR-99b-5p MiR-99b is highly expressed in HCC tissues and cell lines. Overexpression of miR-99b promotes tumor progression, migration and invasion of HCC and is associated with poor outcomes of patients with HCC. PLOS ONE [93]. ZNF354C play a vital role in the dissociation of the complex from CHD1L and BCL9 promoters to abolish the transcription inhibition, suggesting its potential target for diagnosis and treatment of HCC [60]. TP53, a tumor suppressor gene, plays a critical function for the HCC progression [94]. MiRNAs are short (~22nt) RNA molecules that directly bind to the target mRNAs and negatively regulate their expression. MiRNAs are extensively studied and being used as potential biomarkers for various human diseases, including cancers. This study identified five potential miRNAs (miR-124-3p, miR-1-3p, miR-7-5p, miR-34a-5p and miR-99b-5p) that might target and regulate the expression of four hub genes. These five miRNAs are downregulated in HCC (Table 2) [66,69,71,95]. Additionally, these five miRNAs were also suggested as the molecular signature of IBD [96][97][98][99][100][101]. For instance, miR-7-5p inhibited the expression of TFF3 in IBD [101].
Protein-drug interaction analysis is central to drug discovery and disease treatment, which contributes to understanding the mechanisms of action and potential side effects of drugs, as well as the sensitivity of the receptors [102]. The current study identified 46 SRC-related and 11 MMP9-related drugs. Additionally, the protein-disease interaction networks showed three out of five hub genes (CXCL2, SRC and SPP1) are also associated with mammary neoplasms, pulmonary fibrosis, dermatitis and allergic contact diseases. The presence of the common genes between conditions suggested that there might be a link between them [103]. However, in the present study, the identification of the hub genes was barely based on the gene expression microarray data overlapping with immune genes, which might cause some errors/biases in the outcomes. Additional experimental evidence is required to confirm the findings.

Conclusions
In summary, these results strongly indicated CXCL2, MMP9, SPP1 and SRC as key genes in IBD and HCC. The analyses of the present study identified several TFs (FOXC1, FOXL1, GATA2, YY1, ZNF354C and TP53) and miRNAs (miR-124-3p, miR-1-3p, miR-7-5p, miR-34a-5p and miR-99b-5p) that potentially regulate those key genes. These hub genes and their transcriptional and/or posttranscriptional products might be the potential therapeutic targets in connecting mechanisms of HCC and IBD.